An Efficient Algorithm for Mining High Utility Contiguous Patterns from Software Executing Traces
Abstract
Software behavior pattern mining is important because it helps software engineers maintain the correctness of software and detect exceptions as early as possible. High utility software behavior patterns shed light on software behavior and capture unique characteristics of software traces. In this paper, we propose a novel approach, HUCP-Miner (high utility contiguous pattern mining), to mine high utility contiguous patterns from software execution traces. First, this work presents a maximum utility measure, which is used to simplify the utility calculation for contiguous patterns. Second, we propose a novel structure called the UL-list (utility and location list) to store the utility and location information of patterns, which supports backward extension. Based on the UL-list, a remaining utility upper bound model (ruub) and an extension strategy are put forward to prune unpromising patterns early. Finally, an extensive experimental study on different real-life datasets shows that the proposed algorithm performs efficiently.
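To make the problem setting concrete, the following is a minimal brute-force sketch in Python, not HUCP-Miner itself. It rests on assumptions the abstract does not spell out: each trace is a list of (event, utility) pairs, the utility of one occurrence is the sum of its event utilities, the utility of a pattern within a trace is the maximum over its occurrences there (one plausible reading of the maximum utility measure), and a pattern's total utility is the sum over traces. The function name, the `max_len` parameter, and the sample data are hypothetical; the UL-list and ruub pruning are not modeled.

```python
from collections import defaultdict


def mine_contiguous_patterns(traces, min_util, max_len=3):
    """Brute-force baseline (illustrative only, not HUCP-Miner).

    traces:   list of traces, each a list of (event, utility) pairs (assumed format)
    min_util: minimum total utility for a pattern to be reported
    max_len:  hypothetical cap on pattern length to bound the enumeration
    """
    total = defaultdict(int)
    for trace in traces:
        # Best (maximum) utility of each contiguous pattern within this trace.
        best = {}
        n = len(trace)
        for start in range(n):
            util = 0
            for end in range(start, min(start + max_len, n)):
                util += trace[end][1]
                pattern = tuple(event for event, _ in trace[start:end + 1])
                if util > best.get(pattern, float("-inf")):
                    best[pattern] = util
        # Assumed aggregation: sum the per-trace maxima across traces.
        for pattern, util in best.items():
            total[pattern] += util
    return {p: u for p, u in total.items() if u >= min_util}


# Made-up sample traces for illustration.
traces = [
    [("open", 2), ("read", 5), ("close", 1), ("open", 1), ("read", 3)],
    [("open", 4), ("read", 1), ("close", 2)],
]
result = mine_contiguous_patterns(traces, min_util=8)
for pattern, util in sorted(result.items()):
    print(pattern, util)
```

On this sample data, the patterns ("open", "read"), ("open", "read", "close"), and ("read", "close") pass the threshold with total utilities 12, 15, and 9. The enumeration visits every contiguous window, which is the cost that pruning with an upper bound is meant to avoid.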
Related resources
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
A New Algorithm for High Average-utility Itemset Mining
High utility itemset mining (HUIM) is an emerging field in data mining which has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates concerns for individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
Efficient Mining of High Utility Sequential Patterns Over Data Streams
High utility sequential pattern mining has emerged as an important topic in data mining. Although several preliminary works have been conducted on this topic, existing studies mainly focus on mining high utility sequential patterns (HUSPs) in static databases and do not consider streaming data. Mining HUSPs over data streams is very desirable for many applications. However, addressing t...
Specification Mining for Digital Circuits with Applications on Verification and Diagnosis
Software and hardware systems are often built without detailed documentation. The correctness of these systems can only be verified as well as the specifications are written. The lack of sufficient specifications often leads to missed critical bugs, design re-spins, and time-to-market slips. In this paper, we address this problem by mining specifications dynamically from simulation traces. Gi...